Abstract

The field of early intervention is currently faced with the challenge of reducing the prevalence 
of antisocial behavior in children. Longitudinal outcomes research indicates that increased 
antisocial behavior and impairments in social competence skills during the preschool years often 
serve as harbingers of future adjustment problems in a number of domains including mental 
health, interpersonal relations, and academic achievement. This article reports the results 
of a cross-site randomized controlled trial, in which 128 preschool children with challenging 
behaviors were assigned to either a Preschool First Step to Success (PFS) intervention 
(i.e., experimental) or a usual-care (i.e., control) group. Regression analyses indicated that 
children assigned to the Preschool First Step intervention had significantly higher social skills, 
and significantly fewer behavior problems, across a variety of teacher- and parent-reported 
measures at postintervention. Effect sizes for teacher-reported effects ranged from medium to 
large across a variety of social competency indicators; effect sizes for parent-reported social 
skills and problem behaviors were small to medium, respectively. These results suggest that 
the preschool adaptation of the First Step intervention program provides early intervention 
participants, staff, and professionals with a viable intervention option to address emerging 
antisocial behavior and externalizing behavior disorders prior to school entry. 
Introduction


Developing and disseminating evidence-based interventions for children in preschool and the 
primary elementary grades for promoting child social-emotional development have emerged as a 
high priority for schools (see Detrich, Keyworth, & States, 2008; Domitrovich, Moore, &
Greenberg, 2012; Dunlap & Fox, 2011; Dunlap, Smith, Fox, & Blase, 2014; Hoagwood et al., 
2004; Hoagwood et al., 2007). In part, this is due to substantial evidence suggesting that early 
intervention has protective qualities and public health benefits well into adulthood (Hawkins, 
Kosterman, Catalano, Hill, & Abbott, 2005; Reynolds, Temple, Robertson, & Mann, 2001). 
Children who display behavior problems in preschool are likely to continue displaying them in 
elementary school and are at significantly higher risk for ongoing problem behavior and long- 
term detrimental outcomes (Bulotsky-Shearer, Dominguez, Bell, Rouse, & Fantuzzo, 2010; 
Caspi & Moffitt, 1995; Odgers et al., 2008). National survey results indicate that preschool-age 
children are expelled at 3 times the rate of K-12 students (Gilliam & Shahar, 2006). 

Not surprisingly, younger children in school settings who display challenging behavior pat- 
terns severely stress the management skills of teachers (Powell, Fixsen, Dunlap, Smith, & Fox, 
2007). The scope of this problem is reflected in epidemiological findings suggesting that children 
diagnosed with Emotional or Behavioral Disorders (EBD) comprise about 10% of overall preschool
samples (Forness, Freeman, Paparella, Kauffman, & Walker, 2012), while low-income
preschool samples often have an EBD prevalence of 20% or more (Qi & Kaiser, 2003). Such 
numbers can have a profound impact on the preschool classroom’s ecology, particularly as man- 
dated mental health or special education services are rarely available in preschool settings 
(Forness, Kim, & Walker, 2012; Powell et al., 2007). 

This problem is compounded by the fact that many more children in preschool settings are at 
risk for EBD than are actually diagnosed (Brown, Odom, & McConnell, 2008; Bulotsky-Shearer 
et al., 2010; Odgers et al., 2008). There is, in fact, considerable evidence that early childhood 
onset of EBD, as opposed to later childhood or adolescent onset, is more likely to result in a 
persistent disorder that is more severe, less responsive to standard interventions, and more likely 
to result in long-term negative outcomes (Moffitt, 2008; Moffitt & Caspi, 2001). The longer term 
outcomes of EBD are well documented via individual problems experienced in academic devel- 
opment, peer-related adjustment, conflict with authority, and a host of other difficulties that 
extend well beyond the school years (Brennan, Shaw, Dishion, & Wilson, 2012; Campbell, 
Spieker, Burchinal, & Poe, 2006; Odgers et al., 2008). The developmental psychopathology of 
these early disorders, moreover, typically involves relatively trivial oppositional or acting out 
behaviors initially, such as minor peer adjustment problems, noncompliance to parental direc- 
tives, or slight tendencies toward disengagement in daily routines at home. If managed poorly, 
they may then begin to be behaviorally expressed in day care or preschool settings, becoming 
increasingly more serious or disruptive over the preschool years. These trajectories have been 
well documented empirically (see Luby et al., 2012; Nock, Kazdin, Hiripi, & Kessler, 2007; Reid,
Patterson, & Snyder, 2002; Shaw, Owens, Giovannelli, & Winslow, 2001; Webster-Stratton & 
Taylor, 2001). 

For this reason, there has been a growing appreciation for multitiered models for preschool 
prevention and intervention that disrupt or offset these trajectories. These approaches begin with 
“primary” or universal strategies to support a positive and predictable class-wide environment, 
“secondary” strategies to target children who begin to show evidence of risk behaviors, and “ter- 
tiary” strategies for children with diagnosable disorders who require more intensive interventions 
to prevent their disorders from getting worse (Branson & Demchak, 2011; Dunlap & Fox, 2011; 
Fox, Carta, Strain, Dunlap, & Hemmeter, 2010; Hemmeter, Snyder, Kinder, & Artman, 2011; 
Walker et al., 1996). 

At the primary prevention or universal tier, there are various programs of classroom-wide 
positive behavioral support that emphasize modeling and praise for specified student positive 
behavior (Conroy, Dunlap, Clarke, & Alter, 2005; Fullerton, Conroy, & Correa, 2009; Stormont, 
Smith, & Lewis, 2007). At the secondary prevention tier, there are such programs as (a) the Dina 
Dinosaur curriculum component of the Incredible Years program that focuses on a series of 
vignettes for social-emotional learning (Webster-Stratton, Reid, & Hammond, 2004), (b) the
Preschool Promoting Alternative Thinking Strategies (PATHS) curriculum that stresses self- 
regulation and problem solving (Domitrovich, Cortes, & Greenberg, 2007), and (c) the second 
tier of the Teaching Pyramid Model that uses a social-emotional assessment tool to identify pre- 
schoolers who need more specific learning opportunities that can be embedded in their daily 
activities (Hemmeter & Fox, 2009). The tertiary prevention tier includes such interventions as 
the Parent-Child Interaction Therapy (PCIT) program that uses clinical coaching during parent- 
and child-directed interactions to enhance their relationship (Zisser & Eyberg, 2010) and the 
Regional Intervention Program (RIP) that places parents in the classroom setting as intervention- 
ists while successful parent graduates of RIP act as coaches (Strain & Timm, 2001). The reader 
should note that there are several other such programs at both the second and third tiers that primarily
involve parents of young children in the home, such as Nurse-Family Partnership
(Eckenrode et al., 2010), but the emphasis in this article is primarily on center-based 
interventions. 

Joseph and Strain (2003) reviewed 10 such social-emotional curricula along nine dimensions: 
treatment fidelity, treatment generalization, treatment maintenance, social validity, acceptability 
of interventions, replication across investigators, replication across clinical populations, evi- 
dence for ethnic/racial diversity, and replication across settings. Of the 10 programs reviewed, 
only First Step to Success (Walker et al., 1998) and The Incredible Years: Dinosaur School pro- 
gram (Webster-Stratton, 1998; Webster-Stratton, Reid, & Hammond, 2001) received a high con- 
fidence rating (i.e., seven of the nine rating criteria having been met). 

The research project described herein focused on evaluating a downward adaptation of the First Step
intervention, which was developed originally for primary grade students. First Step to Success (Walker et al.,
1998) is a collaborative home and school early intervention designed to assist behaviorally at- 
risk children in getting off to a good start in their school careers (Walker et al., 1997; Walker 
et al., 2014). First Step is classified as a secondary-level intervention that uses in-classroom 
coaching of teachers to cue sustained engagement in prosocial and adaptive activities using a 
reinforcement system that is designed to enhance the target child’s social desirability and peer 
interactions. It targets children who enter elementary school not ready to learn, many of whom 
bring very challenging behavior patterns with them. This selected intervention forges a home and 
school partnership in which the teacher, the child’s parents, and the First Step behavioral coach 
work together in teaching the target child school success skills and a prosocial behavior pattern 
that fosters friendship making. As noted above, the First Step early intervention program was 
developed originally for application with behaviorally challenged students enrolled in the pri- 
mary grades, but the present study extends this downward into the preschool years (Walker et al., 
1997, 1998; Walker et al., 2009). 

The aim of the research reported herein was to determine, via a randomized controlled trial, 
whether the efficacy of the Preschool Adaptation of First Step to Success can be documented by 
improvements in child behavior and social skills outcomes of preschool children who are at high 
risk for the development of oppositional or conduct disorders. A secondary goal was to examine 
its impact on teacher and parent program adherence and satisfaction through extensive fidelity- 
of-treatment measures. 


Method 


Participants 


After securing Human Subjects Institutional Review Board approval, the primary investigators
and lead program trainers recruited preschool programs in implementation sites located in
Oregon, Indiana, and Kentucky. We obtained consent to conduct the study from the program
directors of 32 Head Start and preschool programs in two counties in Oregon and 31 Head Start
and preschool programs located in two counties in Kentucky and Indiana. We then gave brief
presentations to school personnel at these sites to recruit individual teachers as study participants.
As noted below in Figure 1, this recruitment was done in three cohorts; thus, the process took
place over a 3-year period on three separate occasions. We invited teachers from 149 Head Start,
state-funded, tuition-based, and private preschool classrooms to enter the study across the 3
years. If there was more than one teacher in a classroom, both teachers were invited to participate;
however, only the lead teacher’s data were used in the analyses presented herein. Across
three cohorts, 138 of 149 consented teachers (93%) participated in the screening and student
recruitment phase of the study (see Figure 1).

Eligible classrooms with at least one consented teacher (n = 149)
- Fall 2009: 20 in OR and 35 in KY/IN
- Fall 2010: 29 in OR and 21 in KY/IN
- Fall 2011: 19 in OR and 25 in KY/IN

Classrooms excluded from screening (n = 11)
- Teacher declined continued participation (n = 5)
- Teacher declined to complete screening (n = 4)
- Teacher ineligible (n = 2)

Teachers (n = 138) completed nomination and rank ordering of externalizing students (n = 625)

Classrooms excluded from randomization (n = 12)
- Teacher declined continued participation (n = 7)
- Could not obtain parent consent for eligible student (n = 5)

Randomized one consented target student from each classroom (n = 126)

Allocated to wait-list control (n = 61)
- Lost to post data collection (n = 2)
- Analyzed sample (n = 59)

Allocated to intervention (n = 65)
- Lost to post data collection (n = 0)
- Analyzed sample (n = 65)

Figure 1. Schematic overview of participation and sample definition through screening, consent,
randomization, and data collection intervals.

Prior to screening, teachers distributed a waiver of consent letter to the parents of each student 
in his or her classroom. The waiver of consent letter explained the proposed study, described the 
class-wide screening procedure, and detailed steps for declining participation in the screening 
process. Parents who declined participation returned a prepaid postcard to the teacher within 2 
weeks. If the postcard was returned, the child was excluded from the screening process. 
Participating teachers completed an adapted version of the Early Screening Project (ESP;
Walker, Severson, & Feil, 1995). At Screening Stage 1, teachers were given a detailed descrip- 
tion of externalizing behavior and asked to nominate and rank-order five children in their 
classrooms who most closely matched the description of student behavioral characteristics 
provided them. Teachers then completed three Stage-2 ESP rating scales—the Adaptive 
Behavior Index (ABI), Maladaptive Behavior Index (MBI), and Aggressive Behavior Scale 
(ABS)—for each of the children identified in the previous stage. These scales are described in 
greater detail in the Pre-/Postoutcome Measures section. The 138 teachers who participated in 
the screening procedure completed Stage-2 rating scales for 625 students (M = 4.5 students per
classroom). For each scale, we converted total scores to severity scores corresponding to 1 SD,
1.5 SD, and 2 SD from the normative mean (Feil, Severson, & Walker, 1998). Severity scores 
for each scale ranged from 0 (within 1 SD of mean) to 3 (2+ SD from mean). We summed the 
three severity scores (range = 0-9) and rank-ordered the five nominated students within each 
classroom. Children with a scale score of at least 1 SD above the mean met eligibility criteria. 
Of those, we recruited only one student from each classroom to participate in the study, due to 
limits imposed by the scope of the design and data collection costs. Forty-four of the 625 
screened students (7%) did not meet eligibility criteria and were excluded from the parent 
recruitment procedures. 
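
The severity scoring described above can be illustrated with a minimal sketch. The normative means and
SDs, the direction of risk assumed for each scale, the inclusive cutoffs, and the example scores are all
hypothetical values chosen for illustration; they are not taken from the ESP materials or from the study data.

```python
from typing import Dict, List, Tuple

# Hypothetical normative (mean, SD) pairs, for illustration only.
NORMS: Dict[str, Tuple[float, float]] = {
    "ABI": (35.0, 5.0),
    "MBI": (15.0, 5.0),
    "ABS": (13.0, 4.0),
}
# Assumption: higher MBI and ABS scores indicate more risk, whereas lower
# ABI (adaptive behavior) scores indicate more risk.
RISK_DIRECTION: Dict[str, int] = {"ABI": -1, "MBI": 1, "ABS": 1}
# Cutoffs in SD units from the normative mean; treating them as inclusive
# is an assumption.
CUTOFFS = (1.0, 1.5, 2.0)


def severity(scale: str, total: float) -> int:
    """Return 0 (within 1 SD of the mean) to 3 (2 or more SD from the mean)."""
    mean, sd = NORMS[scale]
    z = RISK_DIRECTION[scale] * (total - mean) / sd
    return sum(z >= cutoff for cutoff in CUTOFFS)


def summed_severity(scores: Dict[str, float]) -> int:
    """Sum the three per-scale severity scores (range 0 to 9)."""
    return sum(severity(scale, total) for scale, total in scores.items())


def rank_students(classroom: Dict[str, Dict[str, float]]) -> List[str]:
    """Order nominated students from highest to lowest summed severity."""
    return sorted(classroom, key=lambda s: summed_severity(classroom[s]), reverse=True)


# Hypothetical Stage-2 totals for three nominated students in one classroom.
classroom = {
    "student_c": {"ABI": 34, "MBI": 16, "ABS": 14},
    "student_a": {"ABI": 22, "MBI": 31, "ABS": 23},
    "student_b": {"ABI": 29, "MBI": 21, "ABS": 19},
}
ranking = rank_students(classroom)
# A student is eligible if at least one scale is 1 SD or more from the mean.
eligible = [s for s in ranking if summed_severity(classroom[s]) >= 1]
print(ranking, eligible)
```

In this sketch, the first student in `eligible` corresponds to the child whose parents would be
approached first for consent.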

Within each classroom, project staff rank-ordered the remaining 581 eligible students accord- 
ing to severity, and invited parents of the highest ranked child in each classroom to participate in 
the study. If the parents of the highest ranked child declined, project staff contacted the parents 
of the next highest ranked child in the classroom. This process was repeated until either parent 
consent was obtained for 1 eligible child in each classroom or the families of all eligible children
had declined participation. After screening, 7 teachers declined continued participation in the 
study and project staff members were unable to obtain consent from the parents of any eligible 
students in five additional classrooms. Thus, we randomized 126 of the 149 recruited classrooms 
(85%) with 1 student and 1 teacher from each classroom to either a Preschool First Step intervention
or usual-care control group condition.

The classroom was our unit of randomization. Teachers could participate in the study only 1 
time and were not allowed to be re-recruited. There were no exclusion rules for teachers or class- 
rooms. Classroom settings were 61% Head Start, 41% state-funded preschools, and 6% private 
preschools. As reported in Table 1, participating children had a mean age of 4 years, were pre- 
dominantly male (65%), and African American (31%), Caucasian (44%), or Hispanic (5%). 
Participating teachers were primarily female (99%) and were African American (18%) or 
Caucasian (72%). Teachers reported having taught for an average of 14 years (SD = 9.2). 
Education levels varied with 22% reporting having a high school diploma, 33% an associate’s 
degree, 23% a bachelor’s degree, and 22% a master’s degree or higher. Baseline equivalence 
across conditions and cohorts is discussed in the “Results” section. 


Usual-Care Control Condition 


Teachers in classrooms randomized to the usual-care control group received a half-day of train- 
ing in general classroom management strategies and the principles of positive behavior support. 
The half-day workshop training in the universal principles of classroom management was based 
on principles of Positive Behavior Support (Golly, 2006; Sprague & Golly, 2013). Strategies for 
promoting a positive classroom environment, including positively reinforcing appropriate behav- 
iors, were described. Teachers participated in discussions of their experiences with positive 
behavior support. This workshop was intended to provide some degree of intervention and
enhance equivalency between groups, but was in essence much more general in nature (and
lacked specific intervention strategies) than that given in the first half of the training for teachers
in the experimental intervention group described below. Teachers in the usual-care control group
were eligible for specific training in the Preschool First Step program beginning the following
academic year.

Table 1. Baseline Equivalence of Demographic Characteristics and Screening Measures.

                                      Total        Control      Intervention   Test
Item                                  (N = 126)    (n = 61)     (n = 65)       statistic   p value
Demographic characteristic
  Age, M (SD)                         4.1 (0.4)    4.1 (0.4)    4.0 (0.4)      0.78        .436
  % female                            44 (34.9)    24 (39.3)    20 (30.8)      1.02        .313
  % African American                  39 (31.0)    16 (26.2)    23 (35.4)      1.23        .267
  % Caucasian                         56 (44.4)    27 (44.3)    29 (44.6)      0.01        .968
  % Hispanic                          16 (12.7)    11 (18.0)    5 (7.7)        3.04        .081
Screening measures
  ESP rank                                                                     0.25        .884
    % ranked first                    95 (75.4)    45 (73.8)    50 (76.9)
    % ranked second                   22 (17.5)    11 (18.0)    11 (16.9)
    % ranked third                    9 (7.1)      5 (8.2)      4 (6.2)
  Aggressive Behavior Scale, M (SD)   22.3 (5.8)   22.8 (5.9)   21.8 (5.7)     0.94        .347
  Adaptive Behavior Index, M (SD)     21.8 (4.8)   22.3 (5.0)   21.4 (4.6)     0.97        .334
  Maladaptive Behavior Index, M (SD)  30.9 (6.1)   31.4 (5.5)   30.4 (6.6)     0.96        .341

Note. Reported test statistics are t for continuous and chi-square for dichotomous measures. Values in
percentage rows are n (%). ESP = Early Screening Project.
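
As an illustration of the chi-square tests of baseline equivalence summarized in Table 1, the sketch below
recomputes the statistic for the "% female" row from the reported cell counts (24 of 61 control children
and 20 of 65 intervention children). The authors do not state whether a continuity correction was applied;
the uncorrected Pearson statistic is assumed here, and the function name is hypothetical.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail p value is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: control, intervention. Columns: female, not female.
chi2, p = chi_square_2x2(24, 61 - 24, 20, 65 - 20)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Under that assumption, the computed values round to the statistic and p value reported for this row.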


Experimental Condition 


The Preschool First Step to Success (PFS) intervention includes a daylong workshop training 
session in which the universal principles of classroom management are taught (Golly, 2006; 
Sprague & Golly, 2013) along with training in the PFS intervention. In the first half of the work- 
shop, teachers were taught to (a) develop behavior expectations (i.e., rules); (b) create strategies 
to teach these behavioral expectations to their preschool students through use of examples and 
nonexamples, feedback, and debriefing processes; (c) make plans to positively reinforce the 
behavioral expectations including use of formal motivational systems (e.g., charts, graphs, and 
group reward activities); and (d) review classroom organization to provide routines for entering 
and exiting, transitions, and quiet-time areas. In the second half of the daylong workshop, teach- 
ers learned about First Step to Success’s classroom and home components. Adaptations to the 
elementary school version of First Step for preschool-age children are described below, but for 
more detail, see Frey et al. (2013), Feil et al. (2009), and Frey, Boyce, and Tarullo (2009). A 
behavioral coach followed up with each participating teacher and was available for one-on-one 
consultation in his or her respective classroom during instructional hours. 


Classroom-based component. Implementation of the PFS classroom component has three phases: 
(a) the coach phase, (b) the teacher phase, and (c) maintenance. The classroom intervention com- 
ponent teaches the participating child an adaptive behavior pattern that enhances school success 
as well as friendship making skills for the improvement of peer relations. During the “coach
phase” (i.e., the first 10 program days), the coach works directly with the child and models the
correct implementation procedure for the teacher. The teacher then assumes primary program
responsibility (the “teacher phase”), which lasts 20 days. The PFS coach assumes a supervisory
and troubleshooting role for the remainder of the program.
The First Step to Success program provides feedback to the student using a green and red
card—green displayed by the coach or teacher for positive classroom behavior and red for nega- 
tive behavior. Group dependent contingencies are used to motivate the participating child and 
peers at school, and individual contingencies and home rewards provide incentives for mastery 
of school success skills at home along with their display in school contexts. When a reward cri- 
terion is met in the classroom, as determined by the teacher and coach, the participating child 
earns a brief activity reward (e.g., classroom game, extra recess) for peers. The participating 
child selects an individualized reward from a menu of home rewards preapproved by his or her 
parents. The PFS focus student (i.e., participating student) receives points and praise for engag- 
ing in appropriate classroom behavior (e.g., following classroom rules, cooperating, sharing, 
sitting quietly and attentively during circle time). 

Note that the classroom component of the intervention for this study was modified for younger 
children in the preschool setting via (a) classroom management training and (b) increasing the 
coach’s time with the child. Before one-on-one intervention starts with the target child, the coach 
and teacher identify general positive classroom management strategies organized around the five 
universal principles of positive behavior support that are central to First Step: (a) establish clear 
expectations, (b) teach the expectations, (c) reinforce the expectations, (d) minimize attention for 
minor inappropriate behaviors, and (e) enforce clear consequences for unacceptable behavior 
(Feil et al., 2009; Sprague & Golly, 2013). We have found in this downward extension of First 
Step that younger children require additional practice in understanding and mastering these 
behavioral skills and expectations. As a result, the coach role-plays with the child before each 
implementation session. The Preschool First Step coach provides more supervision and problem- 
solving more often during the intervention than is the case for the regular First Step program. For 
example, if the child’s behavior is inappropriate and he or she does not respond to feedback, the 
coach determines whether the child understands the expectations and, if not, takes that opportu- 
nity to role-play one-on-one the expected behavior in a quiet place and encourages the child to 
comply. These modifications often result in a longer coach phase (up to 10 days) compared with 
the elementary version (5 days). 


Home-based component (homeBase). Over a 6- to 8-week period, parents meet weekly with the 
First Step coach, usually in their home, to learn how to teach the school success skills via reading, 
discussion, role-plays, and demonstrations. Each week’s parent-coach meeting focuses on one 
skill, with review and discussion of previously learned skills as needed. The specific homeBase 
skills taught are: communication and sharing, cooperation, limit setting, problem solving,
friendship making, and self-confidence. Parents are provided with a manual containing all the 
information and accompanying materials needed to implement homeBase. These materials pro- 
vide a useful reference guide for parents, caregivers, and the coach during and following the PFS 
program. The coach provides support, supervision, and trouble shooting of any problems and 
issues that arise during and following the program’s implementation and also serves as a com- 
munication bridge between the teacher and school. In addition, two modifications were made to 
the home component in this study. First, parent meetings were conducted with the child present 
so that the coach could model for the parents how to interact positively with the child during 
completion of program activities. Second, the home component began earlier in the intervention 
timeline. In the Grades K-3 version, the home component begins after Day 10, whereas in the 
preschool version, the home component starts after Day 2. 


First Step Implementation 


First Step was implemented under the guidance of a “coach.” First Step coaches were employees 
of Oregon Research Institute or the University of Louisville. Each site employed eight coaches. 
All coaches had a bachelor’s degree or higher. Coaches attended a 2-day training session during
which they received intensive training on First Step program implementation using various inter- 
active activities. Coaches role-played (a) conducting consent meetings with parents, (b) meeting 
with the focus student, (c) introducing the program to the class, and (d) implementing both the 
first program day of the school intervention and the homeBase module. Coaches also learned 
problem-solving strategies and how to use the daily summary chart and timing device to track 
awarding praise and earned points. During implementation, clinical supervisors monitored 
coaches closely and frequent fidelity checks were conducted to ensure program implementation 
quality. At each site, coaches attended weekly meetings with lead implementers to discuss and 
troubleshoot cases. The lead interventionist from the Oregon site trained staff from the PFS 
implementation sites and also participated in weekly meetings via conference calls to promote 
implementation consistency across sites. 


Data Collection Procedures 


Prior to PFS randomization, training, and implementation, project staff distributed baseline ques- 
tionnaire packets to teachers and parents. Packets were sent by mail or hand-delivered to partici- 
pants. We provided a postage-paid envelope for returning questionnaires and offered to pick up 
packets if needed. We distributed postintervention questionnaire packets using the same proce- 
dures. For intervention students, packets were distributed after completion of the First Step inter- 
vention. To approximate an equivalent window of time between baseline and postintervention 
data collection for the usual-care control condition, we used data from the ESP Stage 2 screening
scales to yoke each child in the control group to a child in the intervention group. The average
number of days between baseline and postintervention data collection did not differ between 
conditions, t(122) = 0.87, p = .386. For intervention students, post packets were collected an
average of 128 days (SD = 28.6) after the baseline assessment; for students in the usual-care
control group, post packets were collected an average of 133 days (SD = 28.1) following baseline.
Parents and teachers were each paid US$50 for the questionnaire packet they returned (i.e.,
screening, baseline, and follow-up data packets). Spanish-speaking parents were given the option 
to complete questionnaires in Spanish. Eight parents (6%) completed Spanish versions of the 
questionnaires. 


Pre-/Postoutcome Measures 


Social Skills Improvement System (SSiS) rating scales. Teacher-reported and parent-reported SSiS 
social skills and problem behavior scales were the primary outcome measures for this study. This 
instrument is designed to assess progress in these skills over time. The social skills scale assesses 
behaviors that promote positive interactions and minimize negative interactions with adults and 
peers, whereas the problem behavior scale assesses behaviors that impede prosocial behavior 
(Gresham & Elliott, 2008). The social skills scale has 46 items for teacher-reported (α = .93) and
parent-reported versions (α = .95). The problem behavior scale includes 30 items for teacher-reported
(α = .89) and 33 items for parent-reported (α = .92) versions. Items are reported on a
4-point frequency scale (i.e., never, seldom, often, almost always). We converted raw scale scores
to standard scores using gender-specific normative data from the SSiS manual. 


ESP Scales: ABI, MBI, and ABS. Stage 2 ESP subscales were used as secondary outcome measures in
this study (Feil & Becker, 1993; Feil et al., 1998). The ABS has nine items (α = .79) measuring the
frequency of aggressive behavior. The ABI (α = .77) and MBI (α = .81) have eight and nine items,
respectively. These indices assess the child’s teacher-related and peer-to-peer behavioral adjustments.
All three ESP measures are rated on a 5-point frequency scale ranging from never to
frequently. Raw scale scores were computed for each measure. While the ESP was developed
originally as a screening measure, Stage 2 subscales from the ESP have also been used as outcome 
measures in other research studies with preschool children (Gunn, Feil, Seeley, Severson, & 
Walker, 2006; Serna, Nielsen, Lambros, & Forness, 2000; Sumi et al., 2013; Walker et al., 2009) 
and have been shown to be sensitive to change and to have robust concurrent validity. The
MBI and ABS, for example, have been strongly correlated with established measures such as the
Teacher Report Form of the Achenbach System of Empirically Based Assessment. The Externalizing
subscale of the Achenbach was correlated with the MBI and ABS, with Pearson’s
rs of .88 (p < .001) and .83 (p < .001), respectively (Feil, Walker, & Severson, 1995).

To facilitate interpretation and discussion, we grouped outcome measures into two domains: 
prosocial behavior and problem behavior. The prosocial behavior domain includes three scales: 
the ABI and teacher- and parent-reported social skills scales (M intercorrelation = .29). The problem
behavior domain includes four scales: the ABS, MBI, and teacher- and parent-reported problem
behavior scales (M intercorrelation = .41).


Process Measures 


Project staff collected implementation fidelity data, measures of teacher-coach alliance, esti- 
mates of child and parent compliance, measures of parent fidelity and dosage, and satisfaction 
data from participants assigned to the intervention condition to determine whether (a) coaches 
and teachers implemented the program as intended; (b) teachers and coaches were satisfied with 
their working relationship as it pertained to program implementation; (c) children and parents 
complied with program requirements; (d) parents were involved and engaged in the homeBase 
component of the program; and (e) teachers and parents were satisfied with program implemen- 
tation, support, and outcomes. Descriptions of each measure follow. 


Implementation Fidelity Checklist (IFC). The IFC is a measure adapted to the preschool setting from 
Walker et al. (2009) to assess delivery and implementation quality of the preschool classroom com- 
ponent. The IFC assesses 16 implementation tasks such as whether the implementer elicits coopera- 
tion from the entire class, informs the class of the activity reward, gives the target student points 
when prompted, provides positive feedback to the target student during the green card game, uses 
verbal reminders, redirects or makes other comments to prompt the student, and records the day’s 
results on the classroom monitoring form (CMF). For each item, the fidelity checklist assesses (a) 
delivery of the component and (b) quality of delivery using a 5-point scale from 0 = very poor, 
.25 = poor, .50 = okay, .75 = good, to 1.0 = excellent (α = .89). Observers collected data on three
occasions: once during the coach phase and twice during the teacher phase. Interrater reliability 
collected on 20% of the fidelity checks conducted was acceptable, intraclass correlation coefficient (ICC[3,1]) = .82.
These data were used to compute adherence and implementation quality scores for the coach, 
teacher, and overall classroom. Adherence scores represent the proportion of critical program fea- 
tures implemented by the coach and teacher. The mean of teacher and coach adherence scores was 
calculated as a measure of overall classroom adherence. Average quality ratings for the coach and 
teacher were calculated as measures of implementation quality; the combined scores for the two 
implementers were used as a measure of overall classroom implementation quality. 


CLASS Monitoring Form (CMF). The CMF (Walker et al., 2009) is used by the PFS coach and teacher 
to track the child’s compliance with daily goals during the 30 program days of the classroom com- 
ponent. On the CMF, the teacher records daily (a) the number of points required to meet the reward 
criterion, (b) the number of points earned, and (c) whether the focus child met criterion or a recycle 
day was necessary (i.e., the child did not meet the daily criterion and the program day was repeated). 
In accordance with previous studies (Sumi et al., 2013; Walker et al., 2009), we calculated 
classroom component dosage as the proportion of program days successfully completed (out of 30)
and child compliance as the proportion of successful program days to total program days. Com- 
puted dosage and compliance scores ranged from 0 to 1. For example, a child who completed 30 
program days without recycling would have a compliance score of 1.0; whereas a child who com- 
pleted 30 days but recycled 7 days would have a compliance score of .81 (i.e., 30/37). 
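
The dosage and compliance calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and returning 0 when no program days were attempted is an assumption made for the example rather than a rule stated for the CMF.

```python
PROGRAM_DAYS = 30  # length of the classroom component, as stated in the text


def classroom_dosage(days_completed, program_days=PROGRAM_DAYS):
    """Proportion of program days successfully completed (0 to 1)."""
    if not 0 <= days_completed <= program_days:
        raise ValueError("days_completed must be between 0 and program_days")
    return days_completed / program_days


def child_compliance(days_completed, recycle_days):
    """Proportion of successful program days to total program days attempted."""
    total_days = days_completed + recycle_days
    if total_days == 0:
        return 0.0  # assumption: no attempted days is scored as 0
    return days_completed / total_days


print(round(classroom_dosage(30), 2))     # 1.0
print(round(child_compliance(30, 0), 2))  # 1.0
print(round(child_compliance(30, 7), 2))  # 0.81, i.e., 30/37
```

The last line reproduces the worked example in the text: 30 successful days plus 7 recycled days gives 30/37.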


homeBase Monitoring Form (HMF). Coaches completed the HMF (Walker et al., 2009) after each 
homeBase session to record (a) whether a session was completed, (b) the parent’s estimated level 
of fidelity, and (c) whether the parent completed the weekly homework assignment. The HMF 
data were used to compute measures of homeBase dosage, parent fidelity, and parent compliance. 
Program dosage was computed as the proportion of treatment units (i.e., homeBase sessions) 
delivered. Scores ranged from 0 (no sessions delivered) to 1 (all sessions delivered). The coach 
rated parent fidelity on a 3-point scale: high (1), medium (0.5), and low (0). If high, parents par- 
ticipated in and implemented all procedures effectively. If medium, parents demonstrated moder- 
ate levels of skill and enthusiasm. If low, parents exhibited limited skill, interest, and cooperation. 
Parent fidelity was calculated as the mean fidelity score across the completed sessions. Scores 
ranged from 0 to 1, with higher scores indicating greater levels of fidelity. At the end of
homeBase Session 1 and each subsequent session, parents were assigned a brief homework activity
corresponding with the week’s topic. We calculated parent compliance as the proportion of 
homework assignments completed across sessions (range = 0-1). 
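
As a rough illustration of how the three homeBase scores could be derived from session records, consider the sketch below. The record layout, the function name, and the number of planned sessions are hypothetical. Computing parent compliance over completed sessions only is also an assumption, because the text does not specify the denominator.

```python
# Numeric values for the 3-point parent fidelity rating described in the text.
FIDELITY_SCORES = {"high": 1.0, "medium": 0.5, "low": 0.0}


def homebase_summary(sessions, planned_sessions):
    """Summarize hypothetical HMF records for one family.

    Each record is a dict with keys 'completed' (bool), 'fidelity'
    ('high', 'medium', 'low', or None), and 'homework_done' (bool).
    """
    completed = [s for s in sessions if s["completed"]]
    dosage = len(completed) / planned_sessions
    ratings = [FIDELITY_SCORES[s["fidelity"]] for s in completed]
    fidelity = sum(ratings) / len(ratings) if ratings else None
    # Assumption: compliance is computed over completed sessions only.
    homework = [s["homework_done"] for s in completed]
    compliance = sum(homework) / len(homework) if homework else None
    return {"dosage": dosage, "fidelity": fidelity, "compliance": compliance}


# Invented records for illustration: three of four planned sessions delivered.
records = [
    {"completed": True, "fidelity": "high", "homework_done": True},
    {"completed": True, "fidelity": "medium", "homework_done": False},
    {"completed": True, "fidelity": "low", "homework_done": True},
    {"completed": False, "fidelity": None, "homework_done": False},
]
summary = homebase_summary(records, planned_sessions=4)
print(summary["dosage"], summary["fidelity"], round(summary["compliance"], 2))
```

All three scores fall between 0 and 1, matching the ranges reported for the HMF measures.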


Alliance survey. After completing the classroom component, the coach and teacher also responded
to a 10-item alliance measure (Walker et al., 2009) to assess their partnership as it related to program
implementation. Coefficient alpha for this scale is excellent for the coach version (α = .94) and
teacher version (α = .95). The survey evaluates aspects of alliance such as the respondent’s perception
of their partner’s approachability, shared goals, communication skills, willingness to collaborate,
and overall effectiveness. Alliance items were rated on a 5-point scale ranging from never to
always. The total alliance score for each informant is the mean rating across the 10 items. Thus, 
the score range is from 0 to 5 with higher scores indicating higher mean alliance ratings. 


Satisfaction survey. We collected teacher and parent satisfaction data after completion of the 
school and home components, respectively. The teacher satisfaction report is a 13-item measure 
(α = .91), scored on a 5-point Likert-type scale from strongly disagree to strongly agree. The
survey assesses the teacher’s perception of training and support received, program usability, and 
program effectiveness. The parent satisfaction report is scaled in the same manner as teacher 
satisfaction. The report includes 12 items (α = .94) examining the parent’s perceptions of program
usability, effectiveness, and value based on impact within the home setting. Both scales
have been used in previous studies of First Step (Sumi et al., 2013; Walker et al., 2009). For each 
measure, we calculated a mean rating across items to assess program satisfaction. Scores ranged 
from 0 to 5 with higher scores indicating higher levels of satisfaction. 


Analysis 


Using Mplus 6.0 statistical software (Muthén & Muthén, 1998-2010), we estimated a series of 
linear regression models. Each outcome was regressed on one predictor and one covariate: a 
dichotomous predictor indicating intervention condition (1 = First Step intervention, 0 = usual- 
care control) and the baseline value of the outcome. We centered the baseline value of the outcome 
(i.e., the sample mean was subtracted from each observed value) to facilitate interpretability and 
calculation of covariate-adjusted, postintervention means. For each outcome, we estimated three 
preliminary models. One model included the predictor, the covariate, and an interaction term (i.e., 
Intervention condition × Baseline value of the outcome) to test the equivalence of the slopes of the
regression lines for each group. We tested whether the site (i.e., Oregon and Kentucky) moderated 
program effects by including an interaction term between intervention condition and site. Finally, 
we tested for cohort effects by creating two dummy-coded variables and testing for a cohort by 
condition interaction effect for each model. If nonsignificant, we removed these interaction terms 
from the model to estimate the main effect of the program condition for each outcome. 
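
The main-effects model described above has the form posttest = b0 + b1(condition) + b2(baseline − mean baseline). The sketch below fits that model by ordinary least squares on complete, invented data. It is not the Mplus robust maximum likelihood procedure used in the study, and the function names are hypothetical. It is included only to show why centering the baseline makes b0 the covariate-adjusted control mean and b0 + b1 the covariate-adjusted intervention mean.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ancova_fit(condition, baseline, posttest):
    """OLS estimates [b0, b1, b2] with a mean-centered baseline covariate."""
    mean_base = sum(baseline) / len(baseline)
    rows = [[1.0, float(c), b - mean_base] for c, b in zip(condition, baseline)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, posttest)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # normal equations


# Invented, noise-free data: 1 = intervention, 0 = usual-care control.
condition = [0, 0, 0, 1, 1, 1]
baseline = [10.0, 12.0, 14.0, 11.0, 13.0, 15.0]
posttest = [18.75, 19.75, 20.75, 22.25, 23.25, 24.25]

b0, b1, b2 = ancova_fit(condition, baseline, posttest)
print(round(b0, 3), round(b1, 3), round(b2, 3))  # approximately 20.0 3.0 0.5
print("adjusted control mean:", round(b0, 3))
print("adjusted intervention mean:", round(b0 + b1, 3))
```

Because the invented data were generated without noise from b0 = 20, b1 = 3, and b2 = 0.5, the fit recovers those values up to floating-point error.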

We used the robust maximum likelihood (RML) estimator in Mplus 6.0 to address missing data 
in the regression models. Maximum likelihood estimation uses all available data to calculate unbi- 
ased parameter estimates and standard errors and is considered a state-of-the-art technique for 
handling missing data (Schafer & Graham, 2002). To improve the accuracy of RML estimation, 
we included eight auxiliary variables in the models as potential correlates of missingness: child’s 
ESP rank, child’s sex, Spanish-speaking parent, current marital status, parent’s education level, 
estimated annual household income, number of children in the household, and parental distress as 
reported on the Parenting Stress Index—Short Form (PSI-SF; Abidin, 1995) via questionnaire. 
Given the higher rate of missing data among parent informants, we included auxiliary variables in 
the models which have been shown to be predictive of subsequent dropout (Beauchaine, Webster- 
Stratton, & Reid, 2005; Herman et al., 2012; Reinke et al., 2012) and which indicate higher levels 
of familial stress or might be perceived as potential impediments between families and research 
staff (i.e., Spanish-speaking participants). Inclusion of auxiliary variables is recommended as part 
of an inclusive analysis strategy because potential correlates of missingness increase statistical 
power, reduce bias, and strengthen the missing at random assumption without altering the inter- 
pretation of parameter estimates (Collins, Schafer, & Kam, 2001; Enders, 2010). 

As a measure of effect size, we report Hedges’s g, which the What Works Clearinghouse
(WWC) recommends as the preferred measure of effect size for continuous outcomes. Hedges’s 
g, the standardized mean difference, is calculated by taking the difference between the mean 
outcome of each group and dividing it by the pooled within-group standard deviation (WWC, 
2011). Effect sizes of .2, .5, and .8 are considered small, medium, and large effects, respectively. 
To correct for multiple comparisons, we applied the Benjamini-Hochberg (B-H) correction to
statistically significant outcomes (Benjamini & Hochberg, 1995). To calculate a B-H correction, 
statistically significant outcomes are ranked in ascending order within domains based on p values 
and a cutoff for each is calculated. For the prosocial behavior domain, which contains three out- 
comes, rank-ordered intervention effects are considered significant at a .05 alpha level if p values 
are less than .017, .033, and .05, respectively. For the problem behavior domain, which includes 
four outcomes, rank-ordered intervention effects are significant at a .05 alpha level if p values are
less than .013, .025, .038, and .05. 
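
The B-H cutoffs listed above follow from the rule that the p value ranked i out of m outcomes is compared with (i/m) × alpha. A minimal sketch of that rule follows. The function names and the example p values are hypothetical, and the sketch uses the conventional "less than or equal to" comparison.

```python
def bh_cutoffs(n_outcomes, alpha=0.05):
    """Cutoff for the p value ranked i (1-based, ascending): i / m * alpha."""
    return [rank / n_outcomes * alpha for rank in range(1, n_outcomes + 1)]


def bh_significant(p_values, alpha=0.05):
    """Return booleans (in input order) under the B-H step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoffs = bh_cutoffs(m, alpha)
    # Find the largest rank whose p value meets its cutoff;
    # every outcome at or below that rank is declared significant.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= cutoffs[rank - 1]:
            max_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        significant[idx] = rank <= max_rank
    return significant


print([f"{c:.4f}" for c in bh_cutoffs(3)])  # ['0.0167', '0.0333', '0.0500']
print([f"{c:.4f}" for c in bh_cutoffs(4)])  # ['0.0125', '0.0250', '0.0375', '0.0500']
print(bh_significant([0.001, 0.020, 0.040]))  # hypothetical p values
```

Rounded to three decimals, these are the cutoffs given in the text for the three-outcome prosocial domain and the four-outcome problem behavior domain.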

In addition, we report the WWC (2011) improvement index as a measure of practical signifi- 
cance. The improvement index is calculated through a two-step process. First, the effect size 
estimate is converted to a Cohen’s U3 index using a standard normal distribution z-score table. 
Then 50%, the percentile rank of an average child in the control condition, is subtracted from the
U3 index, which represents the percentile rank of an average child from the First Step intervention
condition in the distribution of the control condition. The WWC improvement index can
be interpreted as the expected change in percentile rank for an average control group child if the 
child had received the PFS intervention. 
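
The effect size and improvement index calculations can be sketched as follows. The function names are hypothetical. The small-sample correction factor 1 − 3/(4N − 9) comes from the WWC procedures rather than from the description above, so it is exposed as an optional argument. The sketch also uses unadjusted group means and the normal cumulative distribution function in place of a printed z-score table.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c, small_sample_correction=True):
    """Standardized mean difference using the pooled within-group SD."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    g = (mean_t - mean_c) / math.sqrt(pooled_var)
    if small_sample_correction:
        # Assumption: WWC small-sample correction, with N = n_t + n_c.
        g *= 1 - 3 / (4 * (n_t + n_c) - 9)
    return g


def improvement_index(g):
    """Cohen's U3 (as a percentile) minus 50, in percentile points."""
    u3 = 0.5 * (1 + math.erf(g / math.sqrt(2)))  # standard normal CDF at g
    return 100 * u3 - 50


# For outcomes where lower scores are better (e.g., problem behavior),
# pass the absolute value of g to express the change as an improvement.
print(round(improvement_index(0.29)))  # 11
print(round(improvement_index(0.88)))  # 31
```

The two printed values correspond to the smallest and largest prosocial effect sizes reported in the Results (.29 and .88), which map to roughly +11 and +31 percentile points.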


Results 


Baseline Equivalence 


Table 2. Descriptive Statistics for Process Measures.

                             Classroom, M (SD)                          homeBase, M (SD)
Measure                      Coach          Teacher        Combined     Parent          Overall
Protocol adherence           0.95 (0.07)    0.95 (0.09)    0.95 (0.07)  –               0.95 (0.07)
Quality of implementation    0.92 (0.06)    0.78 (0.15)    0.85 (0.09)  0.60 (0.35)     0.77 (0.14)
Dosage                       –              –              0.92 (0.19)  0.84 (0.26)     0.88 (0.17)
Compliance                   –              –              0.88 (0.15)  0.59 (0.40)     0.78 (0.22)
Alliance                     4.34 (0.63)    4.83 (0.40)    –            –               4.59 (0.42)
Satisfaction                 –              4.36 (0.54)    –            4.34 (0.65)     4.35 (0.47)


To evaluate the equivalency of the cohorts as well as intervention and control conditions at baseline,
we examined between-group differences on the seven outcome measures at baseline and the
equivalence of the two groups on child, parent, and teacher baseline demographics. Child baseline
and demographic characteristics are reported by condition in Table 1. The First Step intervention
group and business-as-usual group did not differ significantly on parent demographic
measures including percent living in intact household (27% vs. 26%), number of children in the 
household, M (SD) = 2.3 (1.2) versus 2.5 (1.3), percent with a bachelor’s degree or higher (13% 
vs. 11%), or levels of parental distress, M (SD) = 24.8 (9.9) versus 26.7 (12.0). There was also no 
difference between groups on teacher and classroom characteristics including the percent of 
teachers with a bachelor’s degree or higher (36% vs. 45%), the number of years teaching, 
M (SD) = 12.8 (8.8) versus 16.0 (9.6), and the number of early childhood personnel in the 
classroom, M (SD) = 2.3 (1.0) versus 2.3 (1.7). As can be seen from this table, the two groups 
differed on one demographic variable, the percentage of Hispanic/Latino children. There was a
larger percentage of Hispanic/Latino children randomized to the control condition than to the
intervention condition (23% vs. 7%, respectively). In addition, we examined cohort effects and 
all results were nonsignificant. In other words, all three cohorts were equivalent at baseline. 


Attrition and Missing Data 


Of the 126 participating classrooms, project staff collected baseline packets from 125 teachers 
(99%) and 120 parents (95%). At postintervention, 124 teachers (98%) and 114 parents (91%) 
returned a questionnaire. At baseline, scale-level missing data rates ranged from 1% to 4% for
teacher-reported outcomes and from 7% to 9% for parent-reported outcomes. At postintervention, the
percent of missing data on teacher-reported outcomes ranged from 2% to 5%. For parent-reported
outcomes at postintervention, data were missing for 10% of the sample. 

To test the assumption that data were missing completely at random (MCAR), we used a two- 
step approach. We first examined patterns of missing data and Little’s MCAR test, a global test 
of MCAR. Then, given that Little’s test has low power and is susceptible to Type II errors (Enders, 
2010), we conducted univariate t tests for continuous variables and contingency table analysis for
categorical variables to examine whether, for each outcome, cases with missing data differed 
from those without missing data on other relevant variables including program condition, child 
and parent demographics, and baseline values on screening and outcome measures. Little’s 
MCAR test was nonsignificant (χ² = 212.45, n = 126, p = .200) and none of the examined variables
were significantly associated with missing data groups, suggesting that data were MCAR.


Fidelity, Program Compliance, Alliance, and Satisfaction 


Table 2 summarizes descriptive statistics for the process measures collected for students, parents, 
teachers, and coaches assigned to the intervention condition. Adherence to core protocol compo- 
nents of the program was excellent during both coach (95%) and teacher (95%) phases of the 
study. The quality of the classroom-based component implementation was excellent during the
coach phase (M = 0.92; range = 0.77-1.00) and good during the teacher phase (M = 0.78; range = 
0.41-1.00). On average, students received 88% of the requisite program days and families
received 84% of homeBase sessions. During the classroom-based component, student
compliance, on average, was excellent (.88); whereas during homeBase, parent compliance
and fidelity were in the moderate range (.59 and .60, respectively). Therapeutic alliance was rated 
highly by both coaches (M = 4.34 on a 5-point scale) and teachers (M = 4.83). In addition, parent-
and teacher-reported satisfaction ratings were favorable. Mean teacher-reported scores were 4.36
and mean parent-reported scores were 4.34 on a 5-point scale. Mean item-level ratings were above 4.0
for 12 of 13 teacher-reported items and 11 of 12 parent-reported items. For both informants, the 
only item rated below 4.0 pertained to the amount of time spent implementing the program 
(teacher: M = 3.93; parent: M = 3.95). 


Posttest Differences on Outcome Measures 


Preliminary models testing whether site or cohort moderated program effects yielded nonsignificant
results. For all models, the slopes of the regression lines were equivalent for both experimental
and control conditions. Table 3 summarizes baseline and posttest intervention means and stan- 
dard deviations for the intervention and control conditions, as well as results from the covariate- 
adjusted regression models. For the prosocial behavior domain, the intervention group differed 
from the control group on the three parent- and teacher-reported outcomes, with informants 
reporting statistically significant improvement at posttest in the prosocial functioning of chil- 
dren receiving PFS as compared with children in the control condition. Hedges’s g effect sizes 
for the three prosocial outcomes ranged from .29 to .88. Across the four outcomes in the prob- 
lem behavior domain, children who received the PFS intervention had significant reductions in 
problem behavior across both school and home settings as compared with children who did not 
receive the program. The Hedges’s g effect sizes for the four outcomes ranged from −.45 to
−.79. Under the B-H correction, the three rank-ordered outcomes in the prosocial domain
remain significant at the .05 level if their p values are less than .017, .033, and .05,
respectively. For the four rank-ordered outcomes in the problem behavior domain, the
corresponding cutoffs are .013, .025, .038, and .05. According to these criteria, all seven
outcomes remained statistically significant at the .05 level after applying the B-H adjustment.


Practical Significance of Preschool First Step Intervention Effects 


We calculated an improvement index for each outcome to evaluate the practical significance of the 
PFS program on changes in child behavior. That is, we estimated the mean improvement if an aver- 
age child from the control group had received the intervention. With respect to the prosocial behavior 
domain, the mean improvement index score was +23 percentile points (i.e., an average control student
receiving the intervention would be predicted to have a mean improvement of 23 percentile points on social
skills outcomes). Scores on the teacher-reported ABI and SSiS social skills scale were +31 and +28 
percentile points, respectively. The improvement index score for parent-reported social skills was 
+11 percentile points. Similarly, the mean improvement index score for the problem behavior domain 
was +24 percentile points. Teacher-reported problem behavior outcomes—the MBI, ABS, and SSiS 
problem behavior scale—ranged from +26 to +29 percentile points. For the parent-reported SSiS 
problem behavior scale, the improvement index score was +17 percentile points. There were positive 
improvements on all outcomes and across both the school and home setting; however, greater mean 
improvement was reported across domains in the school setting (+28 percentile points) as compared 
with mean improvements in the home setting (+14 percentile points). 
Discussion


Considerable progress has been made in the past decade in developing and promoting evidence- 
based interventions designed to prevent or reduce existing behavior problems among preschool- 
age children (Barnett et al., 2006; Daley, Jones, Hutchings, & Thompson, 2009; Dunlap & Fox, 
2011; Forness et al., 2000; LaForett, Murray, & Kollins, 2008; McCabe & Altamura, 2011). The 
present research study extends the possible options for efficacious interventions to reduce pre- 
schoolers’ problem behavior and improve their social skills. The outcomes from this randomized 
controlled trial evaluation of the PFS intervention showed significant improvements in prosocial 
behavior as well as significant decreases in problem behavior. Our fidelity results also indicate 
that we affected the classroom environment for the target child and possibly his or her status with 
peers in ways that appeared to increase important positive child outcomes (i.e., parent- and 
teacher-reported social skills). 

This study builds on the considerable body of evidence documenting the efficacy of the origi- 
nal First Step intervention (see Loman, Rodriguez, & Horner, 2010; Seeley et al., 2009; Sumi 
et al., 2013; Walker et al., 2009; Walker et al., 2014) by replicating these findings with preschool 
children, their parents, and their teachers. It also lends empirical support for the preschool ver- 
sion of the intervention. Specifically, in the prosocial behavior domain, the intervention group 
differed from the control group on all three parent- and teacher-reported outcomes with infor- 
mants reporting statistically significant improvement at posttest in the prosocial functioning of 
children receiving PFS as compared with children in the control condition. Our fidelity and par- 
ent/teacher satisfaction results replicated our previous findings with preschoolers (Feil et al., 
2009; Frey et al., 2013; Gunn et al., 2006). There seems to be a substantial foundation of support 
for the use of First Step to Success with preschool-age children and their families. 

However, this study was not without limitations, and additional research is needed to expand 
our understanding of the impact of the PFS intervention. Although we were able to detect inter- 
vention results in parent and teacher ratings, we did not show effects on direct observations of 
child behavior; and this should be a priority for further study and documentation. We recognize 
that relying only on participant ratings of child behavior limits the conclusions that can be drawn 
from these results. That is, there is the potential for participant bias, thereby possibly inflating the 
achieved effects (Hoyt, 2002). Gresham and Elliott (2014), however, while acknowledging direct 
in vivo observations as a gold standard in behavioral research, have reviewed the pros and cons 
in this regard and concluded that “rating scale technology today represents one of the primary 
and most efficient methods used by researchers to describe and categorize children’s behavior 
and attitudes and identify target behaviors in need of intervention” (pp. 158-159). In addition, we 
should note here that we used only subtests from two rating instruments for outcome measures, 
and one was not originally designed as an outcome measure. Nevertheless, the study’s outcomes 
would have been more robust and generalizable if we had (a) been able to demonstrate positive 
effects on direct observation measures and (b) teachers and parents had not been aware of which 
children were the focus of the PFS intervention. 

In the current study, we also only report short-term pre-/postoutcomes. Future studies should 
therefore not only include replication(s) of these findings but should also examine longer term 
outcomes for maintenance effects persisting beyond the intervention year with follow-up into 
kindergarten and the primary grades (Flay et al., 2005). Given also that this was an efficacy trial, 
the behavioral coaches who implemented PFS with teachers and families were part of our core 
research team and had worked previously in the preschool field as behavioral consultants or 
teachers. Feldstein and Glasgow (2008) reported that utilizing a “bridge researcher,” as in this 
case, facilitated the extension and use of empirically supported programs beyond the confines of 
the immediate research. In subsequent work, we plan to hire behavioral coaches whose profiles 
and characteristics more closely resemble those of endogenous providers who would likely 
implement the intervention in a real-world setting and also assess the durability of intervention
effects within and across these contexts. 

It was not possible to tease out which component(s) of the PFS intervention were particularly 
responsible for the outcomes of this study. However, our green/red card cue system and the target 
child’s opportunity to earn rewards for his or her entire class appear to be somewhat unique fea- 
tures as compared with most other two-tier programs mentioned above. As noted by Forness, 
Walker, and Serna (2014), a recent trend has emerged, especially in pharmacologic school-based, 
effectiveness research, in which a new intervention is directly compared with the next best exist- 
ing intervention but simultaneously within the same study. Such an approach, despite its complex- 
ity and possible logistical difficulties, would provide essential information in assisting professionals 
to choose evidence-based practices that might better meet the needs of their preschool children. 

Although research in the area of prevention and intervention of behavior problems in pre- 
schoolers is limited, there are encouraging signs that coordinated adoption of validated practices 
could substantially reduce preschoolers’ challenging behaviors and thereby enhance the social 
and emotional well-being of at-risk preschoolers (Powell, Dunlap, & Fox, 2006). This study 
shows that we can successfully replicate PFS intervention effects across geographically and cul- 
turally different implementation sites with powerful effects in terms of child outcomes and con- 
sumer satisfaction. Implementation of the preschool adaptation of the First Step to Success 
intervention program could assist early childhood educators in achieving the goal of greater 
adoption of empirically validated interventions. In comparison with other interventions noted 
above, the PFS intervention is easy to use for teachers, with a tool that is simple and accessible 
(i.e., green and red construction paper glued together), guidelines for implementation that are 
straightforward, and the provision of consultation with a behavioral coach inherent throughout the
program implementation. This study has shown that PFS has the potential to help early childhood 
professionals mitigate problem behavior among high-risk children and further promote positive 
classroom ecologies in preschool settings. 